Inducing criteria for lexicalization parts of speech using the Cyc KB
Authors
Abstract
We present an approach for learning part-of-speech distinctions by induction over the lexicon of the Cyc knowledge base. This produces good results (74.6%) using a decision tree that incorporates both semantic features and syntactic features. Accurate results (90.5%) are achieved for the special case of deciding whether lexical mappings should use count noun or mass noun headwords. Comparable results are also obtained using OpenCyc, the publicly available version of Cyc.
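To illustrate the kind of induction the abstract describes, here is a minimal sketch of a decision tree learned over categorical features, choosing between a count noun and a mass noun headword. It is not the authors' implementation: the feature names (`semantic_type`, `suffix`), the toy training rows, and the information-gain tree builder are all assumptions made for illustration, and nothing here is drawn from the actual Cyc lexicon.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_feature(rows, labels, features):
    """Return the feature with the highest information gain."""
    base = entropy(labels)
    def gain(feature):
        remainder = 0.0
        for value in set(row[feature] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder
    return max(features, key=gain)

def build_tree(rows, labels, features):
    """Build an ID3-style tree: a leaf is a label, a node is (feature, branches, default)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in set(row[feature] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[feature] == value]
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], remaining)
    return (feature, branches, majority)

def classify(tree, row):
    """Walk the tree; fall back to the node's majority label for unseen values."""
    while isinstance(tree, tuple):
        feature, branches, default = tree
        if row[feature] not in branches:
            return default
        tree = branches[row[feature]]
    return tree

# Hypothetical toy data: one semantic feature and one syntactic (morphological) feature.
TRAIN_ROWS = [
    {"semantic_type": "Substance", "suffix": "none"},
    {"semantic_type": "Substance", "suffix": "-ness"},
    {"semantic_type": "Artifact", "suffix": "none"},
    {"semantic_type": "Artifact", "suffix": "-er"},
    {"semantic_type": "Animal", "suffix": "none"},
    {"semantic_type": "Attribute", "suffix": "-ness"},
]
TRAIN_LABELS = ["mass", "mass", "count", "count", "count", "mass"]

tree = build_tree(TRAIN_ROWS, TRAIN_LABELS, ["semantic_type", "suffix"])
prediction = classify(tree, {"semantic_type": "Substance", "suffix": "-er"})
```

On this toy data the semantic feature separates the classes perfectly, so the tree splits on it first; with real lexicon data the tree would typically mix both kinds of features.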
Similar resources
Inferring parts of speech for lexical mappings via the Cyc KB
We present an automatic approach to learning criteria for classifying the parts-of-speech used in lexical mappings. This will further automate our knowledge acquisition system for non-technical users. The criteria for the speech parts are based on the types of the denoted terms along with morphological and corpus-based clues. Associations among these and the parts-of-speech are learned using th...
Lexicalization vs. Vocalization: A Cross-Linguistic Study of Emphasis in English and Persian
Language is a system of verbal elements that makes communication of meanings possible in the manners the users intend by employing certain linguistic devices which are partly language-specific. Once communicating cross-linguistically, there is always a risk of negative transfer of techniques or processes from the first language (L1) to the foreign language (L2). The current study investigates the "e...
A binaural microscopic model based on a modulation filter bank for predicting speech intelligibility in normal-hearing listeners
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
Investigation of the cytotoxic effect of the ethanolic extract of Anacyclus pyrethrum on the KB oral cancer cell line
Background & Aims: Cancer is one of the leading causes of mortality worldwide. Products derived from natural plants that induce apoptosis are used for cancer treatment. Therefore, investigating different herbal components for new anti-cancer drugs is one of the main research activities throughout the world. This study is the first to investigate the cytotoxic and apoptotic effect of Anacyclu...
Improving part-of-speech tagging using lexicalized HMMs
We introduce a simple method to build Lexicalized Hidden Markov Models (L-HMMs) for improving the precision of part-of-speech tagging. This technique enriches the contextual Language Model taking into account a set of selected words empirically obtained. The evaluation was conducted with different lexicalization criteria on the Penn Treebank corpus using the TnT tagger. This lexicalization obta...